Estimation of the probability distributions of stochastic context-free grammars from the k-best derivations

Authors

  • Joan-Andreu Sánchez
  • José-Miguel Benedí
Abstract

The use of the Inside-Outside (IO) algorithm for the estimation of the probability distributions of Stochastic Context-Free Grammars (SCFGs) in Natural-Language processing is restricted due to the time complexity per iteration and the large number of iterations that it needs to converge. Alternatively, an algorithm based on the Viterbi score (VS) is used. This VS algorithm converges more rapidly, but obtains less competitive models. We describe here a new algorithm that only considers the k-best derivations in the estimation process. The experimental results show that this algorithm achieves faster convergence than the IO and better models than the VS algorithm.
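The abstract does not give the re-estimation formula, so the following is only a minimal sketch of one plausible reading of "considering only the k-best derivations". It assumes that each training sentence comes with its k most probable derivations under the current grammar, that each derivation is weighted by its probability normalized over those k, and that rule probabilities are then renormalized per left-hand side. The function name `reestimate` and the input layout are hypothetical, chosen for illustration, and are not taken from the paper.

```python
from collections import Counter, defaultdict


def reestimate(kbest_per_sentence):
    """One re-estimation step from k-best derivations (illustrative sketch).

    kbest_per_sentence: one entry per training sentence; each entry is a list
    of (probability, rules) pairs, where `rules` is the list of (lhs, rhs)
    rules used by that derivation under the current grammar.
    Returns a dict mapping (lhs, rhs) -> re-estimated probability.
    """
    expected = defaultdict(float)
    for derivations in kbest_per_sentence:
        total = sum(p for p, _ in derivations)
        if total == 0.0:
            continue  # sentence has no usable derivation; skip it
        for p, rules in derivations:
            weight = p / total  # assumption: normalize over the k-best only
            for rule, count in Counter(rules).items():
                expected[rule] += weight * count

    # Renormalize so that rules sharing a left-hand side sum to 1.
    lhs_totals = defaultdict(float)
    for (lhs, _), value in expected.items():
        lhs_totals[lhs] += value
    return {rule: value / lhs_totals[rule[0]] for rule, value in expected.items()}


# Toy usage: one sentence with its 2 best derivations (made-up numbers).
toy = [[
    (0.3, [("S", ("A", "A")), ("A", ("a",)), ("A", ("a",))]),
    (0.1, [("S", ("a", "a"))]),
]]
new_probs = reestimate(toy)
```

Under this weighting, k = 1 reduces to counting rules in the single best derivation, as in a Viterbi-style estimate, while letting k cover every derivation gives the expected counts that the IO algorithm computes.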


Similar resources

Learning of Stochastic Context-Free Grammars by Means of Estimation Algorithms and Initial Treebank Grammars

The use of the Inside-Outside (IO) algorithm for the estimation of the probability distributions of Stochastic Context-Free Grammars is characterized by the use of all the derivations in the learning process. However, its application in real tasks for Language Modeling is restricted due to the time complexity per iteration and the large number of iterations that it needs to converge. Alternative...


Efficient Disambiguation by Means of Stochastic Tree Substitution Grammars

In Stochastic Tree Substitution Grammars (STSGs), one parse (tree) of an input sentence can be generated by exponentially many derivations; the probability of a parse is defined as the sum of the probabilities of its derivations. As a result, some methods of Stochastic Context-Free Grammars (SCFGs), e.g. the Viterbi algorithm for finding the most probable parse (MPP) of an input sentence, are not ...


Consistency of Stochastic Context-Free Grammars From Probabilistic Estimation Based on Growth Transformations

An important problem related to the probabilistic estimation of Stochastic Context-Free Grammars (SCFGs) is guaranteeing the consistency of the estimated model. This problem was considered in [3, 14] and studied in [10, 4] for unambiguous SCFGs only, when the probabilistic distributions were estimated by the relative frequencies in a training sample. In this work, we extend this result by proving...


Statistical Properties of Probabilistic Context-Free Grammars

We prove a number of useful results about probabilistic context-free grammars (PCFGs) and their Gibbs representations. We present a method, called the relative weighted frequency method, to assign production probabilities that impose proper PCFG distributions on finite parses. We demonstrate that these distributions have finite entropies. In addition, under the distributions, sizes of parses ha...


Computation of the Probability of the Best Derivation of an Initial Substring from a Stochastic Context-Free Grammar

Recently, Stochastic Context-Free Grammars have been considered important for use in Language Modeling for Automatic Speech Recognition tasks [6, 10]. In [6], Jelinek and Lafferty presented and solved the problem of computation of the probability of initial substring generation by using Stochastic Context-Free Grammars. This paper seeks to apply a Viterbi scheme to achieve the computation of th...



Publication date: 1998